A Quad-Tree Based Sparse BLAS Implementation for Shared Memory Parallel Computers

Authors

  • Michele Martone
  • Joseph Weizenbaum
Abstract

"Ladies and gentlemen, this is your captain speaking. I have some good news and I have some bad news. The good news is that we have a very strong tail wind, and we are doing one thousand four hundred miles per hour over land. The bad news is that all of our navigation instruments are out, and we don't know where we are, and we don't know where we are going." Joseph Weizenbaum, "Rebel at Work", a documentary film about ...

B Appendix: patterns of indirect memory access, with stride

C Appendix: some more experiments with RSB

Similar resources

Linear Algebra calculations on a Virtual Shared Memory Computer

We evaluate the impact of the memory hierarchy of virtual shared memory computers on the design of algorithms for linear algebra. On classical shared memory multiprocessor computers, block algorithms are used for efficiency. We study here the potential and the limitations of such approaches on globally addressable distributed memory computers. The BBN TC2000 belongs to this class of computers and...

Extending PSBLAS to Build Parallel Schwarz Preconditioners

We describe some extensions to Parallel Sparse BLAS (PSBLAS), a library of routines providing basic Linear Algebra operations needed to build iterative sparse linear system solvers on distributed-memory parallel computers. We focus on the implementation of parallel Additive Schwarz preconditioners, widely used in the solution of linear systems arising from a variety of applications. We report a...

Use of Linear Algebra Kernels to Build an Efficient Finite Element Solver

For scientific codes to achieve good performance on computers with hierarchical memories, it is necessary that the ratio of memory references to arithmetic operations be low. In this paper, we show that Level 3 BLAS linear algebra kernels can be used to satisfy this requirement to produce an efficient implementation of a parallel finite element solver on a shared memory parallel computer with a...

Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers

This paper parallelizes the spatial pyramid match kernel (SPK) implementation. Used along with a support vector machine classifier, SPK is one of the most usable kernel methods and gives high accuracy in object recognition. The MATLAB Parallel Computing Toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...

Solving Unsymmetric Sparse Systems of Linear Equations with PARDISO

Supernode partitioning for unsymmetric matrices together with complete block diagonal supernode pivoting and asynchronous computation can achieve high gigaflop rates for parallel sparse LU factorization on shared memory parallel computers. The progress in weighted graph matching algorithms helps to extend these concepts further and unsymmetric prepermutation of rows is used to place large matri...

Publication date: 2011